candidate variable
Reviews: Exact Combinatorial Optimization with Graph Convolutional Neural Networks
Update following rebuttal: thanks for taking the time to run additional experiments and report back! I am generally supportive of the paper and have accordingly increased my score to 7. If the paper is accepted, I hope the updates to the related work will be incorporated, along with the additional experiments you found to add value.

Summary: This paper proposes an imitation learning approach to learning a branching strategy for integer programming. Key to this approach is a graph neural network representation of the integer programs, combined with feature engineering. This work differs from other recent learning-to-branch approaches in two ways: the imitation learning task may be simpler than previous ranking or regression formulations, and the graph neural network can capture structural information about the instance beyond the simple handcrafted features of previous work.
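The imitation step described in the summary can be sketched as a softmax policy over the candidate variables at each node, trained by cross-entropy against the strong-branching choice. This is a minimal linear stand-in (no graph neural network) with assumed per-candidate feature vectors; `train_branching_policy` and `predict` are illustrative names, not the paper's code.

```python
import numpy as np

def train_branching_policy(X, y, n_feats, lr=0.1, epochs=200, seed=0):
    """Fit a linear softmax scorer that imitates strong-branching choices.

    X: list of (n_candidates_i, n_feats) arrays, one per B&B node.
    y: index of the variable strong branching picked at each node.
    """
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.01, size=n_feats)
    for _ in range(epochs):
        for feats, label in zip(X, y):
            scores = feats @ w
            p = np.exp(scores - scores.max())
            p /= p.sum()
            # cross-entropy gradient: E_p[x] - x_label
            grad = feats.T @ p - feats[label]
            w -= lr * grad
    return w

def predict(w, feats):
    """Branch on the highest-scoring candidate."""
    return int(np.argmax(feats @ w))
```

A real system would replace the linear scorer with the paper's graph convolutional network over the variable-constraint bipartite graph; the training signal is the same.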
Experts in the Loop: Conditional Variable Selection for Accelerating Post-Silicon Analysis Based on Deep Learning
Liao, Yiwen, Latty, Raphaël, Yang, Bin
Post-silicon validation is one of the most critical processes in modern semiconductor manufacturing. In particular, a correct and deep understanding of the test cases of manufactured devices is key to enabling post-silicon tuning and debugging. This analysis is typically performed by experienced human experts. However, with the rapid development of the semiconductor industry, test cases can contain hundreds of variables, and the resulting high dimensionality poses enormous challenges to experts. Some recent works have therefore introduced data-driven variable selection algorithms to tackle these problems and achieved notable success. Nevertheless, these methods do not involve experts in the training and inference phases, which may lead to bias and inaccuracy due to the lack of prior knowledge. Hence, this work is the first to design a conditional variable selection approach that keeps experts in the loop. In this way, we expect our algorithm to be trained more efficiently and effectively to identify the most critical variables under given expert knowledge. Extensive experiments on both synthetic and real-world industrial datasets demonstrate the effectiveness of our method.
- North America (0.14)
- Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
- Semiconductors & Electronics (1.00)
- Information Technology > Hardware (0.54)
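A minimal sketch of "conditional" variable selection in the sense described above: variables fixed by the expert stay in the model, and the algorithm only ranks additions on top of them. Greedy forward selection with an OLS R² score stands in for the paper's deep learning model; `conditional_select` and the scoring choice are assumptions, not the authors' method.

```python
import numpy as np

def r2(X, y):
    """R^2 of an ordinary least squares fit of y on X (with intercept)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

def conditional_select(X, y, expert_vars, k):
    """Greedy forward selection that always keeps the expert-chosen
    variables in the model and adds k more on top of them."""
    chosen = list(expert_vars)
    remaining = [j for j in range(X.shape[1]) if j not in chosen]
    for _ in range(k):
        best = max(remaining, key=lambda j: r2(X[:, chosen + [j]], y))
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

The point of the conditioning is visible in the interface: the search never reconsiders `expert_vars`, so the learned part only has to explain what the expert knowledge does not.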
Yordle: An Efficient Imitation Learning for Branch and Bound
Qu, Qingyu, Li, Xijun, Zhou, Yunfan
Combinatorial optimization problems have attracted extensive research interest due to their huge application potential. In practice, solving a combinatorial optimization problem exhibits highly redundant patterns and characteristics that can be captured by machine learning models. The 2021 NeurIPS Machine Learning for Combinatorial Optimization (ML4CO) competition was therefore proposed with the goal of improving state-of-the-art combinatorial optimization solvers by replacing key heuristic components with machine learning techniques. This work presents the solution and insights gained by team qqy in the dual task of the competition. Our solution, named YORDLE, is a highly efficient imitation learning framework for improving the performance of Branch and Bound (B&B). It employs a hybrid sampling method and an efficient data selection method, which not only accelerate model training but also improve decision quality during branching variable selection. In our experiments, YORDLE greatly outperforms the baseline algorithm adopted by the competition while requiring significantly less time and data to train the decision model. Specifically, we use only a quarter of the data required by the baseline algorithm to achieve a score around 50% higher. The proposed framework YORDLE won the championship of the student leaderboard.
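The abstract does not specify Yordle's data selection criterion, so the following is only a plausible sketch of the idea of training on a fraction of the collected samples: keep the branching observations whose strong-branching scores are most decisive (largest gap between the top two candidates) and discard ambiguous nodes.

```python
import numpy as np

def select_confident_samples(sb_scores, keep_frac=0.25):
    """Keep the fraction of nodes whose strong-branching scores are most
    decisive, i.e. have the largest gap between the best and second-best
    candidate. sb_scores: list of 1-D arrays of SB scores, one per node."""
    gaps = np.array([np.sort(s)[-1] - np.sort(s)[-2] for s in sb_scores])
    k = max(1, int(keep_frac * len(sb_scores)))
    return np.argsort(gaps)[::-1][:k]  # indices of the k clearest nodes
```

With `keep_frac=0.25` this matches the "1/4 of the data" ratio from the abstract, but the 25% retained here is chosen by a margin heuristic that is an assumption, not the team's published method.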
A novel interpretable machine learning system to generate clinical risk scores: An application for predicting early mortality or unplanned readmission in a retrospective cohort study
Ning, Yilin, Li, Siqi, Ong, Marcus Eng Hock, Xie, Feng, Chakraborty, Bibhas, Ting, Daniel Shu Wei, Liu, Nan
Risk scores are widely used for clinical decision making and commonly generated from logistic regression models. Machine-learning-based methods may work well for identifying important predictors, but such 'black box' variable selection limits interpretability, and variable importance evaluated from a single model can be biased. We propose a robust and interpretable variable selection approach using the recently developed Shapley variable importance cloud (ShapleyVIC) that accounts for variability across models. Our approach evaluates and visualizes overall variable contributions for in-depth inference and transparent variable selection, and filters out non-significant contributors to simplify model building steps. We derive an ensemble variable ranking from variable contributions, which is easily integrated with an automated and modularized risk score generator, AutoScore, for convenient implementation. In a study of early death or unplanned readmission, ShapleyVIC selected 6 of 41 candidate variables to create a well-performing model, which had similar performance to a 16-variable model from machine-learning-based ranking.
- Asia > Singapore > Central Region > Singapore (0.05)
- North America > United States > North Carolina > Durham County > Durham (0.04)
- Asia > Taiwan (0.04)
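The ensemble variable ranking idea above can be sketched as follows. This stand-in averages importance ranks over OLS refits on bootstrap resamples, using |coefficient| on standardized inputs as importance; the real ShapleyVIC uses Shapley-based importance over an ensemble of nearly optimal models, so everything here beyond the rank-averaging idea is an assumption.

```python
import numpy as np

def ensemble_variable_ranking(X, y, n_models=50, seed=0):
    """Average each variable's importance rank over a 'cloud' of models
    (here: OLS refits on bootstrap resamples, importance = |coefficient|
    on standardized inputs). A simplified stand-in for ShapleyVIC's
    Shapley-based importance over nearly optimal models."""
    rng = np.random.default_rng(seed)
    Xs = (X - X.mean(0)) / X.std(0)
    n, p = Xs.shape
    ranks = np.zeros(p)
    for _ in range(n_models):
        idx = rng.integers(0, n, n)
        coef, *_ = np.linalg.lstsq(
            np.column_stack([np.ones(n), Xs[idx]]), y[idx], rcond=None)
        imp = np.abs(coef[1:])
        # rank 0 = most important within this model
        ranks += np.argsort(np.argsort(-imp))
    return np.argsort(ranks / n_models)  # variables, best first
```

Averaging ranks across many refits is what protects the ranking from the bias of any single fitted model, which is the motivation the abstract gives for moving beyond single-model importance.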
Parameterizing Branch-and-Bound Search Trees to Learn Branching Policies
Zarpellon, Giulia, Jo, Jason, Lodi, Andrea, Bengio, Yoshua
Branch and Bound (B&B) is the exact tree search method typically used to solve Mixed-Integer Linear Programming problems (MILPs). Learning branching policies for MILP has become an active research area, with most works proposing to imitate the strong branching rule and specialize it to distinct classes of problems. We aim instead at learning a policy that generalizes across heterogeneous MILPs: our main hypothesis is that parameterizing the state of the B&B search tree can significantly aid this type of generalization. We propose a novel imitation learning framework, and introduce new input features and architectures to represent branching. Experiments on MILP benchmark instances clearly show the advantage of incorporating into a baseline model an explicit parameterization of the search-tree state to modulate the branching decisions. The resulting policy reaches higher accuracy than the baseline and on average explores smaller B&B trees, while effectively generalizing to generic unseen instances.
- North America > Canada > Quebec > Montreal (0.04)
- Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
- Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
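The core idea, scoring candidates from the concatenation of per-variable features and a shared search-tree state, can be sketched as a small feed-forward pass. The single hidden layer and the sizes below are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

def branch_scores(var_feats, tree_state, params):
    """Score branching candidates from [variable features ; tree state].
    var_feats: (n_cands, d_v); tree_state: (d_t,), shared by all candidates.
    params: dict with one hidden layer (W1, b1, w2) -- an illustrative
    shape for the paper's idea, not its exact architecture."""
    n = var_feats.shape[0]
    x = np.hstack([var_feats, np.tile(tree_state, (n, 1))])
    h = np.maximum(0.0, x @ params["W1"] + params["b1"])  # ReLU
    return h @ params["w2"]  # one score per candidate

def init_params(d_v, d_t, hidden=16, seed=0):
    """Random initialization for the illustrative scorer above."""
    rng = np.random.default_rng(seed)
    d = d_v + d_t
    return {"W1": rng.normal(scale=1 / np.sqrt(d), size=(d, hidden)),
            "b1": np.zeros(hidden),
            "w2": rng.normal(scale=1 / np.sqrt(hidden), size=hidden)}
```

Because `tree_state` enters every candidate's input, the same variable can receive different scores at different points of the search, which is exactly the modulation the paper's hypothesis calls for.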
Efficient Memory Management for GPU-based Deep Learning Systems
Zhang, Junzhe, Yeung, Sai Ho, Shu, Yao, He, Bingsheng, Wang, Wei
GPUs (graphics processing units) are used for many data-intensive applications, and deep learning systems are among their most important consumers today. As deep learning applications adopt deeper and larger models to achieve higher accuracy, memory management becomes an important research topic for deep learning systems, since GPU memory is limited. Many approaches have been proposed to address this issue, e.g., model compression and memory swapping, but they either degrade model accuracy or require substantial manual intervention. In this paper, we propose two orthogonal approaches that reduce memory cost from the system perspective. Our approaches are transparent to the models and thus do not affect model accuracy. They exploit the iterative nature of deep learning training to derive the lifetime and read/write order of all variables. With the lifetime semantics, we can implement a memory pool with minimal fragmentation. The underlying optimization problem is NP-complete, however, so we propose a heuristic algorithm that reduces memory by up to 13.3% compared with Nvidia's default memory pool, at equal time complexity. With the read/write semantics, variables that are not in use can be swapped from GPU to CPU to reduce the memory footprint. We propose multiple swapping strategies that automatically decide which variable to swap and when to swap it out (and in), reducing memory cost by up to 34.2% without communication overhead.
- North America > United States > New York > New York County > New York City (0.04)
- South America > Peru > Loreto Department (0.04)
- Asia > Singapore (0.04)
- Asia > Middle East > Jordan (0.04)
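Given known lifetimes, the memory-pool idea reduces to placing tensors at pool offsets so that only tensors with disjoint lifetimes share addresses. Since the exact problem is NP-complete, a greedy first-fit heuristic (largest tensors first) is a common sketch; this is illustrative, not the paper's algorithm.

```python
def assign_offsets(tensors):
    """Greedy first-fit placement of tensors in one memory pool.
    tensors: list of (name, alloc_step, free_step, size); two tensors may
    share addresses only if their [alloc, free) lifetimes do not overlap."""
    placed = []   # (offset, size, alloc_step, free_step)
    offsets = {}
    for name, a, f, size in sorted(tensors, key=lambda t: -t[3]):
        # address ranges already occupied at some point during [a, f)
        busy = sorted((o, o + s) for o, s, a2, f2 in placed
                      if a < f2 and a2 < f)
        off = 0
        for lo, hi in busy:          # first gap large enough
            if off + size <= lo:
                break
            off = max(off, hi)
        placed.append((off, size, a, f))
        offsets[name] = off
    peak = max((o + s for o, s, _, _ in placed), default=0)
    return offsets, peak
```

In the example below, tensors A and C never coexist, so they reuse the same offset and the pool's peak (150) stays well under the 250 bytes a naive allocator would reserve.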
A Theory of Dichotomous Valuation with Applications to Variable Selection
An econometric or statistical model may see a marginal gain when a new variable is admitted and a marginal loss when an existing variable is removed. The value of a variable to the model is quantified by its expected marginal gain and expected marginal loss. Assuming equality of opportunity, we derive formulas that evaluate the overall performance across potential modeling scenarios. However, the value is not symmetric in marginal gain and marginal loss; thus, we introduce an unbiased solution. Simulation studies show that our new approaches significantly outperform several variable selection methods used in practice.
- North America > United States > New Jersey > Mercer County > Princeton (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > District of Columbia > Washington (0.04)
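The expected marginal gain and marginal loss of a variable can be estimated by sampling random variable subsets and measuring the change in a model-quality score when the variable is added to, or removed from, the subset. This sketch assumes a user-supplied `score(S)` function and uniform random subsets; the paper's actual valuation and weighting scheme may differ.

```python
import numpy as np

def marginal_values(score, p, n_samples=200, seed=0):
    """Estimate each variable's expected marginal gain (when added to a
    random model that excludes it) and expected marginal loss (when
    removed from a random model that includes it).
    score: callable mapping a set of variable indices to model quality."""
    rng = np.random.default_rng(seed)
    gains, losses = np.zeros(p), np.zeros(p)
    g_cnt, l_cnt = np.zeros(p), np.zeros(p)
    for _ in range(n_samples):
        S = {j for j in range(p) if rng.random() < 0.5}
        base = score(S)
        for j in range(p):
            if j in S:
                losses[j] += base - score(S - {j}); l_cnt[j] += 1
            else:
                gains[j] += score(S | {j}) - base; g_cnt[j] += 1
    return gains / np.maximum(g_cnt, 1), losses / np.maximum(l_cnt, 1)
```

For an additive score the two estimates coincide; the paper's point is that for real models they generally do not, which motivates its unbiased combination of the two.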
Learning to Branch in Mixed Integer Programming
Khalil, Elias Boutros (Georgia Institute of Technology) | Bodic, Pierre Le (Georgia Institute of Technology) | Song, Le (Georgia Institute of Technology) | Nemhauser, George (Georgia Institute of Technology) | Dilkina, Bistra (Georgia Institute of Technology)
The design of strategies for branching in Mixed Integer Programming (MIP) is guided by cycles of parameter tuning and offline experimentation on an extremely heterogeneous testbed, using average performance as the criterion. Once devised, these strategies (and their parameter settings) are essentially input-agnostic. To address these issues, we propose a machine learning (ML) framework for variable branching in MIP. Our method observes the decisions made by Strong Branching (SB), a time-consuming strategy that produces small search trees, and collects features that characterize the candidate branching variables at each node of the tree. Based on the collected data, we learn an easy-to-evaluate surrogate function that mimics the SB strategy by solving a learning-to-rank problem, common in ML. The learned ranking function is then used for branching. The learning is instance-specific and is performed on the fly while a branch-and-bound search solves the MIP instance. Experiments on benchmark instances indicate that our method produces significantly smaller search trees than existing heuristics and is competitive with a state-of-the-art commercial solver.
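The learning-to-rank step can be sketched as pairwise training: learn weights so that, at each node, the strong-branching winner outscores every other candidate by a margin. A perceptron-style hinge update stands in for whatever ranking learner the authors actually use; `train_ranker` is an illustrative name.

```python
import numpy as np

def train_ranker(nodes, n_feats, lr=0.05, epochs=100):
    """Pairwise ranking sketch: learn w so the strong-branching winner at
    each node scores higher than every other candidate (hinge loss).
    nodes: list of (feats, best), where feats is (n_cands, n_feats) and
    best is the index strong branching chose."""
    w = np.zeros(n_feats)
    for _ in range(epochs):
        for feats, best in nodes:
            for j in range(feats.shape[0]):
                if j == best:
                    continue
                # want margin w.(x_best - x_j) >= 1
                diff = feats[best] - feats[j]
                if w @ diff < 1.0:
                    w += lr * diff
    return w
```

At branching time the learned `w` replaces the expensive SB computation: score every candidate with `feats @ w` and branch on the argmax, which is what makes the surrogate "easy to evaluate".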
Model Selection Consistency for Cointegrating Regressions
We study the asymptotic properties of the adaptive Lasso in cointegrating regressions in the case where all covariates are weakly exogenous. We assume that the number of candidate I(1) variables is sub-linear with respect to the sample size (but possibly larger) and that the number of candidate I(0) variables is polynomial with respect to the sample size. We show that, under classical conditions used in cointegration analysis, this estimator asymptotically selects the correct subset of variables, and its asymptotic distribution is the same as that of the OLS estimator computed as if the variables in the model were known beforehand (the oracle property). We also derive an algorithm based on the local quadratic approximation and present a numerical study showing the adequacy of the method in finite samples.
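The adaptive Lasso itself is standard: a Lasso with per-coefficient penalty weights 1/|b_OLS|^γ, so coefficients with strong first-stage signal are shrunk less, which is what drives the oracle property. A minimal coordinate-descent sketch follows (not the paper's local-quadratic-approximation algorithm, and ignoring the I(1)/I(0) distinction):

```python
import numpy as np

def adaptive_lasso(X, y, lam=0.1, gamma=1.0, n_iter=200):
    """Adaptive Lasso via coordinate descent with soft-thresholding.
    Objective: (1/2n)||y - Xb||^2 + lam * sum_j w_j |b_j|, where the
    penalty weights w_j = 1/|b_ols_j|^gamma come from a first-stage OLS."""
    n, p = X.shape
    b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    w = 1.0 / (np.abs(b_ols) ** gamma + 1e-12)
    b = np.zeros(p)
    col_sq = (X ** 2).sum(0)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ b + X[:, j] * b[j]      # partial residual
            rho = X[:, j] @ r
            thr = lam * n * w[j]
            b[j] = np.sign(rho) * max(abs(rho) - thr, 0.0) / col_sq[j]
    return b
```

Irrelevant variables get tiny OLS coefficients, hence huge penalty weights, and are thresholded exactly to zero, which is the variable selection consistency the paper establishes in the cointegration setting.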